Enhanced clinical assessment of hematologic malignancies through routine paired tumor and normal sequencing

Genomic profiling of hematologic malignancies has augmented our understanding of variants that contribute to disease pathogenesis and supported development of prognostic models that inform disease management in the clinic. Tumor-only sequencing assays are limited in their ability to identify definitive somatic variants, which can lead to ambiguity in clinical reporting and patient management. Here, we describe the MSK-IMPACT Heme cohort, a comprehensive data set of somatic alterations from paired tumor and normal DNA using a hybridization capture-based next-generation sequencing platform. We highlight patterns of mutations, copy number alterations, and mutation signatures in a broad set of myeloid and lymphoid neoplasms. We also demonstrate the power of appropriate matching to make definitive somatic calls, including in patients who have undergone allogeneic stem cell transplant. We expect that this resource will further spur research into the pathobiology and clinical utility of clinical sequencing for patients with hematologic neoplasms.

Editorial Note: This manuscript has been previously reviewed at another journal that is not operating a transparent peer review scheme. This document only contains reviewer comments and rebuttal letters for versions considered at Nature Communications. Mentions of the other journal have been redacted.

REVIEWER COMMENTS
Reviewer #1 (Remarks to the Author): This is a resubmission of a manuscript previously reviewed at [Redacted].
In general, the authors have satisfactorily addressed most of the points made in the original review, with a few exceptions.
The authors should still show how often adding germline testing changes the ICC/WHO classification (asked for in the previous review) and not just IPSS-M scores.
The analysis of gnomAD MAF thresholds is still problematic. Using COSMIC as a gold standard for somatic mutations is fraught with error. Most of the COSMIC data comes from tumor-only sequencing, and is full of rare SNPs that have been incorrectly designated as 'somatic' mutations. The AMP/ASCO/CAP guidelines for the 1% MAF cut-off are based on 2014 lab survey data, when molecular testing was still in its infancy; no one doing tumor-only sequencing in 2023 uses a 1% MAF cut-off. The new guidelines (under development) will suggest a much lower MAF cut-off. For that reason, showing the effects of lower MAF cut-offs (at least in the supplement) will be useful for readers going forward.
The authors argue that the lack of germline testing limits the ability to do MRD testing. This is not really correct. It's easy to identify germline variants in serial tumor-only data, since the VAFs are stable at 50%.
Finally, cheap, fast, reliable whole genome sequencing is rapidly on the way for clinical applications; the cost of a 30x genome on the new Illumina NovaSeq X platform will be about 325 dollars, and will soon fall even further. Several recent papers have reported on whole genome workflows in clinical settings, which will ultimately replace all platform tests (since no private reagents are needed, no capture reagents, universal approaches that are not test site-specific, etc.). Exome and panel sequencing were developed a decade ago when sequencing was expensive, and they were never intended to be the long-term gold standard. The authors don't even mention this approach, but they definitely should consider the movement towards WGS in the clinic in the discussion.
Reviewer #2 (Remarks to the Author): Expert in computational genomics
In this manuscript, Drs Ptashkin, Ewalt, Zehir and Arcila describe their diagnostic genomic experience from a large prospective cohort of haematological malignancies. The authors do a comprehensive job of highlighting the difficulties of performing tumour-only genomic analysis, and the impact of different sources of matched normal DNA. Several important points are made regarding the precious nature of the samples and the desire to perform comprehensive genetic analysis from a single assay. The introduction of false positive somatic variants from the tumour-only analysis is well made and should form a valuable contribution for others seeking to conduct precision medicine studies in haematological conditions. I would classify all of my recommendations as minor.
Minor suggestions
1. The data presented in Figure 2 was incomplete for highlighting how the donor and host SNPs are identified, partitioned, and used to create a clean patient-specific profile. Consider also presenting the noisy CNV profile fit without subtracting the donor, and strengthening the confidence that the donor (and not the patient) germline variants are accurately removed.
2. Line 278: what triggers the flow cytometry analysis in the workflow?
3. Line 347: compound heterozygosity is usually the combination of two [small] mutations on different alleles, not the combination of mutation and LOH. I recommend this be described as hemizygosity, or more colloquially, as a second hit.
4. Line 379: how many mutations minimum were required for this analysis? Provide (in silico) experimental justification for the choice of >12.9 Mut/Mb.
5. Line 384: why did the authors not investigate HRD mutations, as I expect these would be prevalent in the cohort, and some studies suggest that there are several mutation signatures.
6. Line 565: describe the characteristics of the panel. Total size, all exons, how much of the introns are included…
7. Line 575: was the accuracy study completed by researchers blinded to the expected results?
8. Line 585: what is the LOD of CNVs? Is this the same for deletions and amplifications?
9. Line 634: how are duplicate variants resolved in the variant union procedure? For example, variant VAF will likely differ between callers.
10. Line 652: More methodological description of the FACETS2n algorithm is required, as this is a major component of the paper. The R package is well documented, and installs, but is lacking a LICENSE file, technically rendering it currently unusable. Please provide example data and supporting files so that the package can be rapidly tested.
11. Line 737: The URL is dead, ensure this is active: https://www.cbioportal.org/study/summary?id=heme_msk_impact_2022
12. Line 742 looks like a URL but is not.
Reviewer #3 (Remarks to the Author): The authors respond to my concerns about the limited additional value of paired tumor-normal sequencing (vs tumor-only) by explaining that private SNPs or putative passengers may be of potential clinical relevance. Whilst this remains possible and the examples the authors cite are valid (tracking the tumor, mutational signatures etc), these are not demonstrated in their study.
So whilst the manuscript describes a robust and effective platform, the technical or clinical advances demonstrated here are limited.

Response to Reviewers
Reviewer #1 (Remarks to the Author): This is a resubmission of a manuscript previously reviewed at [Redacted].
In general, the authors have satisfactorily addressed most of the points made in the original review, with a few exceptions.
The authors should still show how often adding germline testing changes the ICC/WHO classification (asked for in the previous review) and not just IPSS-M scores.
The analysis of gnomAD MAF thresholds is still problematic. Using COSMIC as a gold standard for somatic mutations is fraught with error. Most of the COSMIC data comes from tumor-only sequencing, and is full of rare SNPs that have been incorrectly designated as 'somatic' mutations. The AMP/ASCO/CAP guidelines for the 1% MAF cut-off are based on 2014 lab survey data, when molecular testing was still in its infancy; no one doing tumor-only sequencing in 2023 uses a 1% MAF cut-off. The new guidelines (under development) will suggest a much lower MAF cut-off. For that reason, showing the effects of lower MAF cut-offs (at least in the supplement) will be useful for readers going forward.
Response: We agree with the reviewer that the 1% MAF cut-off is probably too relaxed. We have included the table of lower MAF cut-offs and the number of mutations that pass these filters as Supplemental Table 2, which is also attached below for reference. We have also included additional discussion in the manuscript around this point. We hope these data will be useful for the new guideline preparation.

[Supplemental Table 2: gnomAD MAF cut-offs and number of mutations passing each filter]

The authors argue that the lack of germline testing limits the ability to do MRD testing. This is not really correct. It's easy to identify germline variants in serial tumor-only data, since the VAFs are stable at 50%.
Response: While we agree that it is possible to infer germline variants based on the longitudinal assessment of VAFs across several samples, in our own experience this may lead to many errors, given that it remains an assumption. VAF can be affected by many factors, including alterations in gene copy numbers and variations related to coverage and technology. More importantly, in transplant patients this is simply not possible due to the variable proportions of both host and donor components. In our practice, we very often see patients who were sequenced in outside institutions and who come in with a list of "mutations" that our clinical teams expect us to track and that prove to be germline events that were not filtered. The use of a normal control not only facilitates the assessment in large panels but also enables the unequivocal monitoring of tumor-specific events that would not be possible otherwise, even in the context of transplant.
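The point about variable host and donor proportions can be illustrated with a small worked calculation. The sketch below is not part of the authors' pipeline; the function name, its parameters, and the simplification that every host cell carries the same copy-number state are assumptions made for illustration. It computes the expected VAF of a germline variant in a mixed host/donor sample, showing how a heterozygous host variant can drift far from 50%.

```python
def expected_germline_vaf(host_fraction,
                          host_alt_copies=1, host_total_copies=2,
                          donor_alt_copies=0, donor_total_copies=2):
    """Expected VAF of a germline variant in a host/donor mixture.

    Hypothetical helper: assumes all host cells share one copy-number
    state and all donor cells share another (no subclonality).
    """
    if not 0.0 <= host_fraction <= 1.0:
        raise ValueError("host_fraction must be between 0 and 1")
    donor_fraction = 1.0 - host_fraction
    # Variant-carrying copies contributed by each component.
    alt = host_fraction * host_alt_copies + donor_fraction * donor_alt_copies
    # All copies of the locus contributed by each component.
    total = host_fraction * host_total_copies + donor_fraction * donor_total_copies
    if total == 0:
        raise ValueError("locus has zero total copies in the mixture")
    return alt / total


# Heterozygous host variant, no donor cells: the textbook 50%.
print(expected_germline_vaf(1.0))
# Same variant with 20% host cells and a homozygous-reference donor.
print(expected_germline_vaf(0.2))
# Same variant after a single-copy gain of the variant allele (2 of 3 copies).
print(expected_germline_vaf(1.0, host_alt_copies=2, host_total_copies=3))
```

Under these assumptions the same germline variant is expected at 50%, 10%, and roughly 67% VAF respectively, which is why a "stable at 50%" rule alone cannot separate germline from somatic events in these settings.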
Finally, cheap, fast, reliable whole genome sequencing is rapidly on the way for clinical applications; the cost of a 30x genome on the new Illumina NovaSeq X platform will be about 325 dollars, and will soon fall even further. Several recent papers have reported on whole genome workflows in clinical settings, which will ultimately replace all platform tests (since no private reagents are needed, no capture reagents, universal approaches that are not test site-specific, etc.). Exome and panel sequencing were developed a decade ago when sequencing was expensive, and they were never intended to be the long-term gold standard. The authors don't even mention this approach, but they definitely should consider the movement towards WGS in the clinic in the discussion.
Response: We concur with the reviewer that whole-genome sequencing (WGS) technology will have a place in the clinical management of patients with hematological malignancies. However, there is still a long way to go before this becomes a reality outside of a few select academic centers or large clinical research organizations. This is due to the costs associated with sequencing platforms and the technical expertise required to analyze WGS data for decision-making purposes. Platforms like Oxford Nanopore could close this gap even further, and large language models like GPTs could enable easier interpretation of results in the future. WGS will absolutely require a matched normal sample to confidently identify the spectrum of somatic alterations. This has also been addressed in the discussion.
Reviewer #2 (Remarks to the Author): Expert in computational genomics
In this manuscript, Drs Ptashkin, Ewalt, Zehir and Arcila describe their diagnostic genomic experience from a large prospective cohort of haematological malignancies. The authors do a comprehensive job of highlighting the difficulties of performing tumour-only genomic analysis, and the impact of different sources of matched normal DNA. Several important points are made regarding the precious nature of the samples and the desire to perform comprehensive genetic analysis from a single assay. The introduction of false positive somatic variants from the tumour-only analysis is well made and should form a valuable contribution for others seeking to conduct precision medicine studies in haematological conditions. I would classify all of my recommendations as minor.
Minor suggestions
1. The data presented in Figure 2 was incomplete for highlighting how the donor and host SNPs are identified, partitioned, and used to create a clean patient-specific profile. Consider also presenting the noisy CNV profile fit without subtracting the donor, and strengthening the confidence that the donor (and not the patient) germline variants are accurately removed.

Response: We thank the reviewer for this suggestion and have added an additional figure to the manuscript as eFigure 4. This figure highlights the genome-wide allele-specific copy number profile of a post-transplant patient obtained using (a) the baseline host germline reference and (b) the intersection of heterozygous SNPs between baseline host and baseline donor germline reference samples. Using a baseline reference from only one individual results in the inability to accurately identify regions of allelic imbalance due to host-donor chimerism in the post-transplant setting. By utilizing baseline germline reference samples from both the host and donor, we are able to confidently identify regions of allelic imbalance, including copy-neutral loss of heterozygosity.
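The intersection step described in (b) can be sketched in a few lines. This is an illustration only and does not reproduce the FACETS2n implementation; the function name, the genotype encoding, and the dictionary layout keyed by (chromosome, position) are assumptions. The idea is to keep only sites that are heterozygous in both the host and the donor baseline samples, so that the allelic signal at the retained sites is informative for both components of a chimeric sample.

```python
# Genotype strings treated as heterozygous (assumed VCF-style encoding).
HET_GENOTYPES = {"0/1", "1/0", "0|1", "1|0"}


def shared_het_snps(host_genotypes, donor_genotypes):
    """Return sorted sites that are heterozygous in both host and donor.

    Both arguments map (chromosome, position) -> genotype string.
    Hypothetical helper for illustration; sites missing from either
    sample are dropped.
    """
    return sorted(
        site
        for site, genotype in host_genotypes.items()
        if genotype in HET_GENOTYPES
        and donor_genotypes.get(site) in HET_GENOTYPES
    )


host = {("chr1", 100): "0/1", ("chr1", 200): "0/1", ("chr2", 50): "1/1"}
donor = {("chr1", 100): "0/1", ("chr1", 200): "0/0", ("chr2", 50): "0/1"}

# Only chr1:100 is heterozygous in both baseline samples.
print(shared_het_snps(host, donor))
```

In this toy input, the site at chr1:200 is dropped because the donor is homozygous reference there, and chr2:50 is dropped because the host is homozygous alternate.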